Adaptive Modeling and Planning for Reactive Agents
Author
Abstract
This research is concerned with problems where an agent is situated in a stochastic world without prior knowledge of the world's dynamics. The agent must act so as to maximize its expected discounted reward over time. The state and action spaces are extremely large or infinite, and control decisions are made in continuous time. The objective of this research is to create a system capable of generating competent behavior in real time. The approach taken in my research is to incrementally learn a model that can be used for planning. Sutton (1990) and Moore & Atkeson (1993) used a similar approach to solve problems where the underlying world dynamics are modeled by a Markov decision process with finite states and actions and with control decisions taking place at unit intervals. In that setting, the parameters of the model are estimated from experience, and dynamic programming (Bellman 1957) is used to produce optimal reactive plans. The problems considered in my research, however, cannot be modeled directly as such a Markov decision process because they involve infinitely many states and actions, with control decisions taking place in continuous time. One way to model such problems is to partition the state space into regions and treat all the states in the same region collectively. The action space may also be partitioned into regions, with actions in the same region assumed to have the same effect on the world. Because transitions between regions of the state space are of variable duration, the world should be modeled by a semi-Markov decision process. The parameters of this model can be estimated from experience, and dynamic programming can be used for planning. The primary challenge in this work is determining how to partition the state and action spaces in a way that maximizes performance. Since it is not feasible to consider every possible way to partition these spaces, my system relies upon heuristics to determine how the partition should be adapted.
This computational framework is called the Adaptive Modeling and Planning System (AMPS) since it refines its model based on experience and incorporates planning to incrementally improve behavior. This system is designed to be efficient and broadly applicable across domains.
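The modeling and planning steps described above can be illustrated with a minimal sketch. It is not the AMPS implementation: the class name `RegionSMDPModel`, the discount rate `beta`, and the choice to treat each reward as a lump sum received at the start of a transition are all assumptions made for illustration. The partition into regions is taken as given and fixed here, since the abstract does not spell out the heuristics used to adapt it.

```python
import math
from collections import defaultdict


class RegionSMDPModel:
    """Empirical semi-Markov model over regions (hypothetical sketch)."""

    def __init__(self, beta=0.5):
        # Assumed continuous-time discount rate: a reward arriving after
        # a delay of tau is weighted by exp(-beta * tau).
        self.beta = beta
        # (state_region, action_region) -> [(next_region, reward, duration)]
        self.samples = defaultdict(list)

    def record(self, region, action, next_region, reward, duration):
        """Store one observed transition between regions."""
        self.samples[(region, action)].append((next_region, reward, duration))

    def backup(self, region, action, values):
        """Sample-average estimate of the value of taking action in region."""
        observed = self.samples[(region, action)]
        total = sum(
            reward + math.exp(-self.beta * duration) * values.get(nxt, 0.0)
            for nxt, reward, duration in observed
        )
        return total / len(observed)

    def plan(self, sweeps=1000, tol=1e-9):
        """Value iteration over the regions seen so far.

        Returns (values, policy). Regions with no recorded outgoing
        transitions are treated as terminal with value 0.
        """
        actions = defaultdict(set)
        for region, action in self.samples:
            actions[region].add(action)
        values, policy = {}, {}
        for _ in range(sweeps):
            delta = 0.0
            for region, available in actions.items():
                best = max(available, key=lambda a: self.backup(region, a, values))
                new_value = self.backup(region, best, values)
                delta = max(delta, abs(new_value - values.get(region, 0.0)))
                values[region], policy[region] = new_value, best
            if delta < tol:
                break
        return values, policy


# Toy example: two ways to leave "start", differing only in duration.
model = RegionSMDPModel(beta=0.5)
model.record("start", "left", "mid", reward=0.0, duration=1.0)
model.record("start", "right", "mid", reward=0.0, duration=3.0)
model.record("mid", "go", "goal", reward=10.0, duration=1.0)
values, policy = model.plan()
print(policy["start"], round(values["start"], 3))
```

Because the discount depends on the observed duration of each transition, the planner prefers the faster route out of "start" even though both routes collect the same reward. This is the property that distinguishes the semi-Markov model from a unit-step Markov decision process.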
Similar resources
An economic-statistical model for production and maintenance planning under adaptive non-central chi-square control chart
Most inventory control models assume that quality defects never occur, which means the production process is perfect. However, in real manufacturing processes, the production process starts its operation in the in-control state but, after a period of time, shifts to the out-of-control state because of the occurrence of some disturbances. In this paper, in order to approach the model to real man...
A review of agent-based modeling (ABM) concepts and some of its main applications in management science
We live in a very complex world where we face complex phenomena such as social norms and new technologies. To deal with such phenomena, social scientists often use a reductionist approach in which they reduce them to some lower-level variables and model the relationships among them through a scheme of equations. This approach, called equation-based modeling (EBM), has some basic weaknesses in ...
Adaptive Autonomy: The Key to Dynamic, Responsive Formation of Sensible Agent Organizations
This research is supported in part by the Texas Higher Education Coordinating Board #003658452 and the Applied Research Laboratories and Office of Naval Research Grant N00014-96-1-0298. The practical deployment of distributed agent-based systems mandates that each agent behave sensibly. This paper focuses on the development of flexible, responsive, adaptive systems based on Sensib...
Sensible Agents
The practical deployment of distributed agent-based systems mandates that each agent behave sensibly. This paper focuses on the development of flexible, responsive, adaptive systems based on Sensible Agents. Sensible Agents perceive, process, and respond based on an understanding of both local and system goals. Each agent is capable of (1) deliberative or reactive planning and execution of one ...
Adaptive Neural Network Method for Consensus Tracking of High-Order Mimo Nonlinear Multi-Agent Systems
This paper is concerned with the consensus tracking problem of high-order MIMO nonlinear multi-agent systems. The agents must follow a leader node in the presence of unknown dynamics and uncertain external disturbances. The communication network topology of the agents is assumed to be a fixed undirected graph. A distributed adaptive control method is proposed to solve the consensus problem utilizing re...
Lilies - a framework for building multiple agents for adaptive planning
Lilies (Localisation and InterLeaving stragIES) was developed to deal with a forest fire fighting planning environment. The application required adaptive planning in a reactive and generative environment. To model the application, it was necessary for multiple agents to be developed, which added the usual communication [15, 19] issues. The planning environment needed to deal with both reactive an...
Publication date: 2005